Interpretable Distribution Features with Maximum Testing Power

Authors

  • Wittawat Jitkrittum
  • Zoltán Szabó
  • Kacper P. Chwialkowski
  • Arthur Gretton
Abstract

Two semimetrics on probability distributions are proposed, given as the sum of differences of expectations of analytic functions evaluated at spatial or frequency locations (i.e., features). The features are chosen so as to maximize the distinguishability of the distributions, by optimizing a lower bound on test power for a statistical test using these features. The result is a parsimonious and interpretable indication of how and where two distributions differ locally. We show that the empirical estimate of the test power criterion converges with increasing sample size, ensuring the quality of the returned features. In real-world benchmarks on high-dimensional text and image data, linear-time tests using the proposed semimetrics achieve comparable performance to the state-of-the-art quadratic-time maximum mean discrepancy test, while returning human-interpretable features that explain the test results.
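To make the idea of "differences of expectations of analytic functions evaluated at spatial locations" concrete, the following is a minimal sketch of a statistic of that general shape. It is an illustration under stated assumptions, not the authors' implementation: the Gaussian kernel, the fixed test locations, the bandwidth `sigma`, the regularizer `reg`, and the names `gaussian_kernel`, `solve`, and `me_statistic` are all choices made here for illustration. In particular, the abstract describes optimizing the features to maximize a lower bound on test power, whereas this sketch simply takes the locations as given.

```python
import math
import random


def gaussian_kernel(x, v, sigma):
    """Gaussian kernel between a sample point x and a test location v."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def solve(A, b):
    """Solve A u = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    u = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * u[k] for k in range(r + 1, n))
        u[r] = (M[r][n] - tail) / M[r][r]
    return u


def me_statistic(X, Y, locations, sigma=1.0, reg=1e-5):
    """Illustrative statistic n * zbar^T (S + reg*I)^{-1} zbar.

    Each z_i holds the kernel differences k(x_i, v_j) - k(y_i, v_j) over
    the J test locations v_j; zbar and S are their sample mean and
    sample covariance. The cost is linear in the sample size n.
    """
    n = len(X)
    if n != len(Y) or n < 2:
        raise ValueError("X and Y must have the same length, at least 2")
    J = len(locations)
    Z = [
        [gaussian_kernel(x, v, sigma) - gaussian_kernel(y, v, sigma)
         for v in locations]
        for x, y in zip(X, Y)
    ]
    zbar = [sum(z[j] for z in Z) / n for j in range(J)]
    S = [
        [
            sum((z[a] - zbar[a]) * (z[b] - zbar[b]) for z in Z) / (n - 1)
            + (reg if a == b else 0.0)
            for b in range(J)
        ]
        for a in range(J)
    ]
    u = solve(S, zbar)
    return n * sum(m * w for m, w in zip(zbar, u))


if __name__ == "__main__":
    rng = random.Random(0)
    X = [[rng.gauss(0.0, 1.0)] for _ in range(500)]
    Y = [[rng.gauss(1.0, 1.0)] for _ in range(500)]  # mean-shifted sample
    print(me_statistic(X, Y, locations=[[0.0], [1.0]]))
```

A larger value suggests that the two samples differ near the chosen locations, which is what makes the locations readable as features. Choosing the locations and the kernel bandwidth well is the part the paper addresses and this sketch does not.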


Similar resources

Two-sided power distribution

In this paper, a new family of distributions with many applications in financial engineering is introduced. This family contains important statistical distributions such as the triangular, exponential, and uniform distributions. A special case of this distribution is considered first, and its important features are then surveyed. How to calculate maximum likelihood estimates are pres...


Placement and Sizing of Various Renewable Generations in Distribution Networks with Consideration of Generation Uncertainties using Point Estimate Method

Abstract: Deployment of Distributed Generation (DG) units has increased due to the yearly increase in electric energy demand and technological advancements beyond Smart Grid. Although DGs offer several advantages, such as reducing economic costs and environmental impacts, the operation of these units in power systems creates several problems. In this paper, optimal allocation and sizing of DG units in ...


Power Normal-Geometric Distribution: Model, Properties and Applications

In this paper, we introduce a new skewed distribution of which the normal and power normal distributions are two special cases. This distribution is obtained by taking the geometric maximum of independent, identically distributed power normal random variables. We call this distribution the power normal-geometric distribution. Some mathematical properties of the new distribution are presented. Maximu...


Inference on Pr(X > Y) Based on Record Values From the Power Hazard Rate Distribution

In this article, we consider the problem of estimating the stress-strength reliability $Pr (X > Y)$ based on upper record values when $X$ and $Y$ are two independent but not identically distributed random variables from the power hazard rate distribution with common scale parameter $k$. When the parameter $k$ is known, the maximum likelihood estimator (MLE), the approximate Bayes estimator and ...


Three Dimensional Transient Numerical Modeling of Temperature Distribution and Output Power in Photovoltaic Module

Because temperature affects the output power of a photovoltaic module, this research calculates the temperature distribution in a photovoltaic module by numerically solving the energy balance equations, so that its output power can be accurately predicted. For this purpose, several photovoltaic modules are modeled in detail in the COMSOL software. A new method for calc...




Publication date: 2016